A novel entropy-based dynamic data placement strategy for data intensive applications in Hadoop clusters
Similar resources
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. Current Hadoop and the rack-aware data placement strategy of the Hadoop Distributed File System assume a homogeneous cluster, in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...
Data Placement Strategy for Hadoop Clusters
Wireless technology has become very widely used, and an array of security measures, such as authentication, confidentiality strategies, and security schemes based on the 802.11 wireless communication protocol, have been proposed and applied to real-time wireless networks. However, most of these measures consider security issues only in static mode, in which security levels are all configured when wireles...
A Data Placement Algorithm for Data Intensive Applications in Cloud
Data layout is an important issue: a good layout reduces data movement among data centers and so improves the efficiency of the entire cloud system. This paper proposes a data layout algorithm oriented toward data-intensive applications. It is based on hierarchical data correlation clustering and the PSO algorithm. The datasets with fixed location have been considered, and both the offline strategy and the ...
An Improved Data Placement Strategy in a Heterogeneous Hadoop Cluster
The Hadoop Distributed File System (HDFS) is designed to store big data reliably and to stream that data at high bandwidth to user applications. However, the default HDFS block placement policy assumes that all nodes in the cluster are homogeneous and places blocks randomly without considering the nodes' resource characteristics, which decreases the self-adaptability of the system. In this paper, we ...
Entropy-based Consensus for Distributed Data Clustering
The ever larger scale of available data and the stricter concerns about its privacy are among the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with consideration for the confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Journal
Journal title: International Journal of Big Data Intelligence
Year: 2019
ISSN: 2053-1389,2053-1397
DOI: 10.1504/ijbdi.2019.097395